A Grid-Based Distributed SVM Data Mining Algorithm
Authors
Abstract
Distributing data and computation allows larger problems to be solved and supports applications that are distributed in nature. In this paper we present a grid-based distributed Support Vector Machine (SVM) algorithm. The Grid is a distributed computing infrastructure that enables coordinated resource sharing within dynamic organizations consisting of individuals, institutions, and resources. Grid environments can be used for both compute-intensive tasks and data-intensive applications, as they offer resources, services, and data access mechanisms. Data mining algorithms and knowledge discovery processes are both compute- and data-intensive; the Grid can therefore offer a computing and data management infrastructure for supporting decentralized and parallel data analysis. The SVM algorithm is implemented in C and MPI.
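The abstract does not say how the SVM training is split across grid nodes, so the following is only a minimal sketch of one common scheme, not the paper's method: each node trains a linear SVM on its own data partition, and a coordinator averages the resulting models. It is written in Python and run in a single process for readability, whereas the paper's implementation uses C and MPI. All names (`train_local_svm`, `average_models`, `make_data`), the synthetic data, and the hyperparameters are assumptions made for illustration.

```python
import random


def train_local_svm(points, labels, epochs=50, lam=0.01, lr=0.1):
    """Train a linear SVM on one partition by hinge-loss subgradient descent."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point violates the margin: regularization step plus hinge step.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is outside the margin: regularization step only.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def average_models(models):
    """Coordinator step: average the weights and biases of the local models."""
    n = len(models)
    dim = len(models[0][0])
    w = [sum(m[0][i] for m in models) / n for i in range(dim)]
    b = sum(m[1] for m in models) / n
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def make_data(n, seed):
    """Hypothetical toy data: two separable classes centred at (2, 2) and (-2, -2)."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n):
        y = rng.choice([-1, 1])
        points.append([2 * y + rng.uniform(-1, 1), 2 * y + rng.uniform(-1, 1)])
        labels.append(y)
    return points, labels


points, labels = make_data(120, seed=0)

# Simulate three grid nodes, each holding one slice of the data.
partitions = [(points[i::3], labels[i::3]) for i in range(3)]
local_models = [train_local_svm(p, l) for p, l in partitions]

# Only the small model vectors travel to the coordinator, not the raw data.
global_model = average_models(local_models)

accuracy = sum(predict(global_model, x) == y for x, y in zip(points, labels)) / len(points)
print("accuracy of averaged model:", accuracy)
```

In an MPI setting, the averaging step would typically map onto a collective reduction over the per-node weight vectors, with each rank running the local training on its own partition.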
Similar resources
Dynamic Replication based on Firefly Algorithm in Data Grid
In data grids, reservation is an accepted way to provide scheduling and quality of service. Users need access to data stored across a geographically distributed environment, which can be addressed by using replication, and an action taken to reach certainty. As a result, users are directed toward the nearest replica to access information. The most important point is to know in which sites and distributed sy...
Empowering Scientific Discovery by Distributed Data Mining on the Grid Infrastructure
The grid-based computing paradigm has attracted much attention in recent years. Computational Grids focus on methods for handling compute-intensive tasks, while Data Grids are geared towards data-intensive computing. This dissertation considers research in grid-based distributed data mining. While architectures for mining on the grid have already been proposed, the inherently distributed, heterog...
Grid-based Distributed Data Mining Systems, Algorithms and Services
Distribution of data and computation allows for solving larger problems and executing applications that are distributed in nature. The Grid is a distributed computing infrastructure that enables coordinated resource sharing within dynamic organizations consisting of individuals, institutions, and resources. The Grid extends the distributed and parallel computing paradigms allowing resource negoti...
Empowering Scientific Discovery by Distributed Data Mining on a Grid Infrastructure
The grid-based computing paradigm has attracted much attention in recent years. The sharing of distributed computing resources (such as software, hardware, data, sensors, etc.) is an important aspect of grid computing. Computational Grids focus on methods for handling compute-intensive tasks, while Data Grids are geared towards data-intensive computing. Grid-based computing has been put to use in...
Privacy Preserving Data Mining For Horizontally Distributed Medical Data Analysis
To build reliable prediction models and identify useful patterns, assembling data sets from databases maintained by different sources such as hospitals is becoming increasingly common; however, doing so might divulge sensitive information about individuals and thus lead to increased concerns about privacy, which in turn prevent different parties from sharing information. Privacy Preserving Distributed ...